Database Citation in Full Text Biomedical Articles
Abstract
Molecular biology and literature databases represent essential infrastructure for life science research. Effective integration of these data resources requires structured cross-references at the level of individual articles and biological records. Here, we describe current patterns in how database entries are cited in research articles, based on analysis of the full-text Open Access articles available from Europe PMC. Focusing on citation of entries in the European Nucleotide Archive (ENA), UniProt and Protein Data Bank in Europe (PDBe), we demonstrate that text mining doubles the number of structured annotations of database record citations supplied in journal articles by publishers. Many thousands of the literature-database relationships found by text mining are new, since they are also absent from the set of articles cited by database records. We recommend that structured annotation of database records in articles be extended to other databases, such as ArrayExpress and Pfam, entries from which are also cited widely in the literature. The very high precision and high throughput of this text-mining pipeline make this activity possible both accurately and at low cost, which will allow the development of new integrated data services.
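As a rough illustration of how accession numbers might be spotted in article text, the following Python sketch pairs accession-shaped patterns with a nearby database name. It is not the pipeline described in the abstract, whose details are not given here. The function name `find_citations`, the context cue words, and the 40-character window are all hypothetical choices; the UniProt pattern follows the accession format published in UniProt's help pages, and the PDB pattern covers only the classic four-character identifiers.

```python
import re

# Accession-shaped patterns (illustrative, not the article's pipeline).
PATTERNS = {
    # UniProt accession format as documented by UniProt.
    "uniprot": re.compile(
        r"\b(?:[OPQ][0-9][A-Z0-9]{3}[0-9]"
        r"|[A-NR-Z][0-9](?:[A-Z][A-Z0-9]{2}[0-9]){1,2})\b"
    ),
    # Classic PDB IDs: a non-zero digit followed by three alphanumerics.
    "pdb": re.compile(r"\b[1-9][A-Za-z0-9]{3}\b"),
}

# Hypothetical cue words that must appear near a candidate accession.
CONTEXT = {
    "uniprot": re.compile(r"uniprot|swiss-?prot", re.IGNORECASE),
    "pdb": re.compile(r"\bpdb\b|protein data bank", re.IGNORECASE),
}


def find_citations(text, window=40):
    """Return (database, accession) pairs whose surroundings name the database."""
    hits = []
    for db, pattern in PATTERNS.items():
        for match in pattern.finditer(text):
            # Look at a fixed-size window of characters around the candidate.
            start = max(0, match.start() - window)
            context = text[start:match.end() + window]
            if CONTEXT[db].search(context):
                hits.append((db, match.group()))
    return hits


if __name__ == "__main__":
    sample = (
        "The structure (PDB ID 1TUP) and the sequence "
        "(UniProt accession P04637) were used."
    )
    print(find_citations(sample))
```

Requiring a cue word near each candidate is one simple way to suppress false positives such as four-digit years, which would otherwise match the PDB pattern; a production system would need considerably more validation, for example checking each accession against the database itself.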
Related articles
Database citation in supplementary data linked to Europe PubMed Central full text biomedical articles
BACKGROUND: In this study, we present an analysis of data citation practices in full text research articles and their corresponding supplementary data files, made available in the Open Access set of articles from Europe PubMed Central. Our aim is to investigate whether supplementary data files should be considered as a source of information for integrating the literature with biomolecular databa...
HubMed: a web-based biomedical literature search interface
HubMed is an alternative search interface to the PubMed database of biomedical literature, incorporating external web services and providing functions to improve the efficiency of literature search, browsing and retrieval. Users can create and visualize clusters of related articles, export citation data in multiple formats, receive daily updates of publications in their areas of interest, navig...
Pub-Med-dot-com, here we come!
As of April 8, 2016, articles in Neurology® Genetics can be searched using PubMed. Launched in 1996, PubMed is a search engine that accesses citations and abstracts of more than 26 million articles. Its primary sources include the MEDLINE database, which was started in the 1960s, and biomedical and life sciences journal articles that date back to 1946. In addition, PubMed accesses other sources...
Tagging gene and protein names in full text articles
Current information extraction efforts in the biomedical domain tend to focus on finding entities and facts in structured databases or MEDLINE abstracts. We apply a gene and protein name tagger trained on MEDLINE abstracts (ABGene) to a randomly selected set of full text journal articles in the biomedical domain. We show the effect of adaptations made in response to the greater heterogeneity o...
Investigation on Full-Text Databases Cited in LIS
Background and Aim: The main objective of this research was to investigate the use of full-text databases in the LIS theses of Tehran State Universities between 2005 and 2009. Method: For this purpose, a total of 9952 citations from 172 theses held in the central academic libraries were studied. The data collected were analyzed using bibliometric and citation analysis met...